EXTRUCT: Using Deep Structural Information in XML Keyword Search

Authors

  • Arash Termehchy
  • Marianne Winslett
Abstract

Users who are unfamiliar with database query languages can search XML data sets using keyword queries. Previous work has shown that current XML keyword search methods, although intuitive, do not effectively use the data’s structural information and provide poor precision, recall, and ranking for most queries. Based on an extension of the concept of information theory, we have developed principled frameworks called normalized total correlation (NTC) and normalized term presence correlation (NTPC) to measure the relevance of candidate answers to keyword queries. We demonstrate EXTRUCT, an XML keyword search interface that uses NTC and NTPC. An extensive empirical evaluation over two real-world XML DBs has shown that EXTRUCT has better precision and recall and provides better ranking than all previous approaches. We demonstrate EXTRUCT, along with seven other keyword search systems for four real-world XML data sets, using prepared queries as well as queries from the audience. The demonstration shows that using deep structural information increases the effectiveness of XML keyword search systems considerably.

Similar resources

An Effective Path-aware Approach for Keyword Search over Data Graphs

Abstract—Keyword Search is known as a user-friendly alternative for structured languages to retrieve information from graph-structured data. Efficient retrieving of relevant answers to a keyword query and effective ranking of these answers according to their relevance are two main challenges in the keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...

Structural Feedback for Keyword-Based XML Retrieval

Keyword-based queries are an important means to retrieve information from XML collections with unknown or complex schemas. Relevance Feedback integrates relevance information provided by a user to enhance retrieval quality. For keyword-based XML queries, feedback engines usually generate an expanded keyword query from the content of elements marked as relevant or nonrelevant. This approach that...

Relevance Feedback for Structural Query Expansion

Keyword-based queries are an important means to retrieve information from XML collections with unknown or complex schemas. Relevance Feedback integrates relevance information provided by a user to enhance retrieval quality. For keyword-based XML queries, feedback engines usually generate an expanded keyword query from the content of elements marked as relevant or nonrelevant. This approach that...

SAIL: Structure-aware indexing for effective and progressive top-k keyword search over XML documents

Keyword search in XML documents has recently gained a lot of research attention. Given a keyword query, existing approaches first compute the lowest common ancestors (LCAs), or their variants, of the XML elements that contain the input keywords, and then identify the subtrees rooted at the LCAs as the answer. In this paper we study how to use the rich structural relationships embedded in XML docu...

Keyword Search in XML Database with Relevance Ranking and Maintaining Stored Websites

Keyword search over an XML database provides access to the data while overcoming keyword ambiguity. Users can search the XML database with keyword queries, as they would a text database. A novel IR-style approach is described that captures XML’s hierarchical structure well and works well on pure keyword queries, independent of any schema information of the XML data. A search engine prototype called XReal is...


Journal:
  • PVLDB

Volume 3

Publication date: 2010